Transactional Support in MapReduce for Speculative Parallelism

Authors

  • Naresh Rapolu
  • Karthik Kambatla
  • Suresh Jagannathan
  • Ananth Grama
Abstract

MapReduce has emerged as a popular programming model for large-scale distributed computing. Its framework enforces strict synchronization between successive map and reduce phases and permits only limited data sharing within a phase. Using key-value-based persistent storage with MapReduce presents intriguing opportunities and challenges. These challenges relate primarily to semantic inconsistencies arising from the different fault-tolerance mechanisms employed by the execution environment and the underlying storage medium. We define formal transactional semantics for MapReduce over reliable key-value stores. With minimal performance overhead and no increase in program complexity, our solutions support broad classes of distributed applications hitherto infeasible in MapReduce. Specifically, this paper (i) motivates the use of key-value stores as the underlying storage for MapReduce, (ii) defines transactional semantics for MapReduce to address any inconsistencies, (iii) demonstrates the broader application scope enabled by data sharing within and across jobs, and (iv) presents a detailed evaluation demonstrating the low overhead of our proposed semantics.

Similar resources

TransMR: Data-Centric Programming Beyond Data Parallelism

MapReduce and related data-centric programming models have proven to be effective for a variety of large-scale distributed computations, in particular, those that manifest data parallelism. The fault-tolerance model underlying these programming environments relies on deterministic replay, which makes data-sharing (side-effects) across computations harder to support. This significantly limits th...

Exploring Speculative Parallelism in Spec2006

The computer industry adopted multi-threaded and multi-core architectures as clock-rate increases stalled in the early 2000s. It was hoped that the continuous improvement of single-program performance could be achieved through these architectures. However, traditional parallelizing compilers often fail to effectively parallelize general-purpose applications which typically have complex control ...

Improving Continuation-Powered Method-Level Speculation for JVM Applications

Most applications running on the Java Virtual Machine (JVM) make extensive use of dynamic object-oriented programming features such as inheritance, polymorphism, and encapsulation. This makes them very hard or even impossible to analyze statically, defeating most of the automatic parallelization research done so far for traditional compute-heavy scientific applications. In this paper, we propose...

Automatic Tuning of the Parallelism Degree in Hardware Transactional Memory

Transactional Memory (TM) is an emerging paradigm that promises to ease the development of parallel applications. Due to its inherently speculative nature, however, TM can suffer from performance degradation in the presence of conflict-intensive workloads. A key technique to tackle this issue consists of dynamically regulating the number of concurrent threads, which allows for selecting the concurre...

Speculative Concurrent Processing with Transactional Memory in the Actor Model

The actor model has been successfully used for scalable computing in distributed systems. Actors are objects with a local state, which can only be modified by the exchange of messages. One of the fundamental principles of actor models is to guarantee sequential message processing, which avoids typical concurrency hazards, but limits the achievable message throughput. Preserving the sequential s...

Publication date: 2010